Theory of Phase Transition in the Evolutionary Minority Game 
We discover the mechanism for the transition from self-segregation (into opposing groups) to 
clustering (towards cautious behaviors) in the evolutionary minority game (EMG). The mechanism 
is illustrated with a statistical mechanics analysis of a simplified EMG involving three groups of 
agents: two groups of opposing agents and one group of cautious agents. Two key factors affect the 
population distribution of the agents. One is the market impact (the self-interaction), which has 
been identified previously. The other is the market inefficiency due to the short-time imbalance in 
the number of agents using opposite strategies. Large market impact favors "extreme" players who 
choose fixed strategies, while large market inefficiency favors cautious players. The phase transition 
depends on the number of agents (N), the reward-to-fine ratio (R), as well as the wealth reduction 
threshold (d) for switching strategy. When the rate for switching strategy is large, there is strong 
clustering of cautious agents. On the other hand, when N is small, the market impact becomes 
large, and the extreme behavior is favored. 
Complex adaptive systems are ubiquitous in the social, biological and
economic sciences. In these systems agents adapt to changes in the global
environment, which are induced by the actions of the agents themselves. The
main theme in the study of complex systems is to understand the emergent
properties of the global dynamics. Of particular interest are systems in
which the agents have no direct interaction but compete to be in the
minority; they modify their behaviors (strategies) based on past experience.
Examples of such systems include financial markets [3], rush-hour traffic
and ecological systems. In the context of demand and supply in economic
systems, the idea of the minority game is particularly relevant. If the
demand is larger than the supply, the price of the goods will increase; this
benefits the sellers, who are in the minority. Many agent-based models of
economic systems and financial markets indeed incorporate the essence of the
minority game.

In this letter we focus on the EMG proposed by Johnson et al. The model is
defined as follows. There are N agents (N odd). At each round they choose to
enter Room 0 (sell a stock or choose route A) or Room 1 (buy a stock or
choose route B). At the end of each round the agents in the room with fewer
agents (the minority) win a point, while the agents in the room with more
agents (the majority) lose a point. The winning room numbers (0 or 1) are
recorded, and they form a historical record of the game. All agents share a
common memory containing the outcomes that followed the most recent
occurrences of all $2^m$ possible bit strings of length m. The basic
strategy is derived from the common memory and changes dynamically. Given
the current m-bit string, the basic strategy is simply to choose the room
that won after the most recent occurrence of the same m-bit string in the
historical record. To use the basic strategy is thus to follow the trend. In
the EMG each agent is assigned a probability p: the agent adopts the basic
strategy with probability p and the opposite of the basic strategy with
probability 1 - p. The agents with p = 0 or p = 1 are "extreme" players,
while the agents with p = 1/2 are cautious players. The game and its
outcomes evolve as less successful agents attempt to modify their p values.
This is achieved by allowing the agents whose accumulated wealth falls below
d (d < 0) to change their p values. In the original EMG model, the new p
value is chosen randomly in an interval of width $\Delta p$ centered on the
agent's original p value. The agent's wealth is then reset to zero and the
game continues. Thus in the EMG the agents constantly learn from mistakes
and adapt their strategies as the game evolves.
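
The rules above can be sketched as a short simulation. This is a minimal illustration, not the code used for the results in this letter: the function name `run_emg`, its parameter names, the random initial memory, and the reflecting treatment of p at 0 and 1 are all assumptions of the sketch. The reward for winning is written as a parameter `reward` so that the reward-to-fine ratio R discussed later can be varied (the fine is fixed at 1).

```python
import random

def run_emg(n_agents=101, m=3, d=-4.0, reward=1.0, dp=0.1,
            steps=2000, seed=0):
    """Toy evolutionary minority game; returns the final list of p values."""
    rng = random.Random(seed)
    p = [rng.random() for _ in range(n_agents)]   # strategy probabilities
    wealth = [0.0] * n_agents
    # Common memory: the winning room last seen after each m-bit history.
    # It is initialized at random here (an assumption of this sketch).
    memory = [rng.randint(0, 1) for _ in range(2 ** m)]
    history = rng.randrange(2 ** m)               # current m-bit string as int

    for _ in range(steps):
        basic = memory[history]                   # room suggested by the trend
        choices = [basic if rng.random() < p[i] else 1 - basic
                   for i in range(n_agents)]
        in_room_1 = sum(choices)
        # n_agents is odd, so one room is always strictly in the minority.
        winner = 1 if in_room_1 < n_agents - in_room_1 else 0

        for i in range(n_agents):
            wealth[i] += reward if choices[i] == winner else -1.0
            if wealth[i] < d:
                # Redraw p within a window of width dp around the old value.
                # Reflecting at 0 and 1 is an assumption; it keeps p in
                # [0, 1] as long as dp <= 2.
                new_p = p[i] + rng.uniform(-dp / 2, dp / 2)
                if new_p < 0.0:
                    new_p = -new_p
                elif new_p > 1.0:
                    new_p = 2.0 - new_p
                p[i] = new_p
                wealth[i] = 0.0

        memory[history] = winner
        history = ((history << 1) | winner) % (2 ** m)
    return p
```

A histogram of the returned list, averaged over many time steps and runs, gives an estimate of the distribution P(p).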

A remarkable feature emerges from the study of the EMG: the agents
self-segregate into two opposing extreme groups with $p \approx 0$ and
$p \approx 1$. This conclusion is rather robust; it does not depend on N, d,
$\Delta p$, m, or the initial distribution of p. The final distribution
always has a symmetric U-shape. This leads to the following conclusion: in
order to succeed in a competitive society the agent must take extreme
positions (either always follow the basic strategy or always go against it).
This behavior can be explained by the market impact of the agents' own
actions, which largely penalizes the cautious agents. By introducing the
reward-to-fine ratio R, Hod and Nakar found that the above conclusion is
robust only when R > 1. When R < 1 there is a tendency to cluster towards
cautious behaviors, and the distribution of the p value, P(p), may evolve to
an inverted-U shape with the peak at the middle. In some ranges of the
parameters M-shape distributions are also observed. To explain the
clustering of cautious agents, Hod gives a phenomenological theory relating
the accumulated wealth reduction to a random walk with time-dependent
oscillating probabilities [10]. However, the dynamical mechanism for the
phase transition is not clear. This letter aims to present such a mechanism,
based on an analysis using statistical mechanics.

FIG. 1: The distribution P(p) for R = 0.971 and d = -4, for N = 101, 735,
1467, 2935, 5869, and 10001. The distribution is obtained by averaging over
100,000 time steps and 10 independent runs.

We first present our numerical results, which show that the transition from
self-segregation to clustering is generic for R < 1. We have performed
extensive simulations of the EMG for a wide range of values of the
parameters N, R, and d. The transition depends on all three parameters.
Figure 1 shows the distribution P(p) for R = 0.971, d = -4, and N = 101,
735, 1467, 2935, 5869 and 10001. For a given R (< 1) and d, we observe the
transition from self-segregation to clustering as the number of agents N
increases. The shape of the distribution P(p) changes from a U-shape to an
inverted-U shape (near the transition point P(p) has an M-shape). The
standard deviation $\sigma_p$ of the distribution decreases as N increases.
We define the critical value $N_c$ as the value of N at which $\sigma_p$
equals the standard deviation of the uniform distribution, i.e. when

$$\sigma_p^2 = \int (p - 1/2)^2 P(p)\,dp = \frac{1}{12}.$$

Our results can be summarized by the general expression for the critical
value

$$N_c = \left(\frac{|d|}{A(1-R)}\right)^2,$$

where A is a constant of order one. Alternatively one might view the
transition by varying d with fixed N and R. As |d| increases the system
changes from clustering to self-segregation. The critical value is then
given by $|d_c| = A(1-R)\sqrt{N}$. Figure 2 plots $N_c$ vs |d| for various
R. When $R \to 1$ the clustering only occurs for very large N or very small
|d|. At R = 1 the clustering disappears and the behavior of self-segregation
becomes robust.

FIG. 2: The critical value $|d_c|$ for R = 0.5, 0.6, 0.7, 0.8, 0.9, 0.94,
and 0.975.
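
The clustering criterion and the scaling relation can be written down directly. The following is a small sketch; the function names are hypothetical, and the default A = 1 is only a placeholder for the order-one constant, whose actual value has to come from simulation data.

```python
import math

def variance_about_half(p_values):
    """Estimate sigma_p^2 = <(p - 1/2)^2> from a sample of p values."""
    return sum((p - 0.5) ** 2 for p in p_values) / len(p_values)

def is_clustered(p_values):
    """True when the spread is narrower than that of a uniform distribution,
    whose variance about 1/2 is 1/12."""
    return variance_about_half(p_values) < 1.0 / 12.0

def critical_n(d, reward, a=1.0):
    """N_c = (|d| / (A (1 - R)))^2, meaningful only for R < 1.
    a=1.0 is a placeholder for the order-one constant A."""
    return (abs(d) / (a * (1.0 - reward))) ** 2

def critical_d(n_agents, reward, a=1.0):
    """|d_c| = A (1 - R) sqrt(N), the same relation solved for |d|."""
    return a * (1.0 - reward) * math.sqrt(n_agents)
```

The two functions `critical_n` and `critical_d` are inverses of each other for fixed R and A, which is a quick consistency check on the two forms of the expression.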

Hod and Nakar explain that R < 1 corresponds to difficult situations (tough
environments) in which the agents tend to be confused and indecisive and
thus become cautious. We find that the rate of strategy switching (which
depends on both R and d) affects the distribution of the agents more
directly. For R < 1 an agent switches its strategy every 2|d|/(1 - R) time
steps on average. So when R or |d| is small, the agents have less patience
and switch their strategies more frequently; this, as we explain below,
causes large market inefficiency and thus favors cautious agents. It is the
rapid adaptation that makes the agents "confused" and "indecisive". On the
other hand, when the number of agents is small, the market impact becomes
large. Take for example a population consisting of only three agents with
p = 0, 1/2, and 1 respectively. The two extreme agents always make opposite
choices, so whichever room the cautious agent (with p = 1/2) enters becomes
the majority room. The cautious agent therefore always loses, while each
extreme agent is in the majority only half of the time. In this case the
cautious agent experiences the full market impact of his own action. Indeed
our data show that when N is small enough the self-segregation into extreme
behaviors dominates.
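
The three-agent example can be verified by enumerating the four equally likely rounds. This is an illustrative sketch; the function name and the dictionary labels are hypothetical.

```python
from itertools import product

def majority_fraction():
    """Enumerate the 4 equally likely rounds of the three-agent game.

    Returns the fraction of rounds each agent (p = 0, 1/2, 1) spends
    in the majority room.
    """
    counts = {"p=0": 0, "p=1/2": 0, "p=1": 0}
    rounds = 0
    # basic: room suggested by the basic strategy; coin: the cautious agent's
    # choice, which is independent of `basic` and equally likely 0 or 1.
    for basic, coin in product((0, 1), repeat=2):
        choice = {"p=0": 1 - basic, "p=1/2": coin, "p=1": basic}
        ones = sum(choice.values())
        majority_room = 1 if ones >= 2 else 0
        for name, room in choice.items():
            counts[name] += (room == majority_room)
        rounds += 1
    return {name: c / rounds for name, c in counts.items()}

print(majority_fraction())  # {'p=0': 0.5, 'p=1/2': 1.0, 'p=1': 0.5}
```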

We now show that the mechanism for clustering around p = 1/2 and the
transition to self-segregation can be understood from a simplified model in
which p takes only three possible values, p = 0, 1/2, and 1. The agents in
Group 0 (with p = 0) make the opposite decision from the agents in Group 1
(with p = 1). We denote the group with p = 1/2 "Group m". The probability of
winning depends only on $N_0$, $N_m$, and $N_1$, which are the respective
numbers of agents in Groups 0, m, and 1.

We begin by evaluating the average wealth reduction for the agents in each
of the three groups. Let n be the number of agents in Group m making the
same decision (let us call it decision A) as those in Group 0; $N_m - n$ is
then the number of agents in Group m making the same decision (decision B)
as those in Group 1. If $N_0 + n < (N_m - n) + N_1$, or
$n < N_m/2 + (N_1 - N_0)/2$, the agents making decision A will win; when
$n > N_m/2 + (N_1 - N_0)/2$, the agents making decision B will win. The
winner has its wealth increased by R, while the loser has its wealth reduced
by 1. With $N_0$, $N_m$, and $N_1$ fixed, the probability of winning depends
on n.

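When $N_m \gg 1$, the distribution of n can be approximated by a Gaussian
distribution

$$P(n) = \frac{1}{\sqrt{2\pi}\,\sigma_m}
\exp\!\left(-\frac{(n - N_m/2)^2}{2\sigma_m^2}\right),$$

where $\sigma_m = \sqrt{N_m}/2$. Given the distribution, one can write down
the average wealth change for the agents in Group 0,

$$\Delta w_0 = R \int_0^{N_m/2 + N_d/2} P(n)\,dn
- \int_{N_m/2 + N_d/2}^{N_m} P(n)\,dn,$$

where $N_d = N_1 - N_0$. This can be rewritten in terms of the error
function $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$,

$$\Delta w_0 = -\frac{1-R}{2} + \frac{1+R}{2}\,
\mathrm{erf}\!\left(\frac{N_d}{2\sqrt{2}\,\sigma_m}\right). \tag{1}$$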
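
As a sanity check on the Gaussian approximation, Eq. (1) can be compared with the same average computed from the exact binomial distribution of n. The sketch below assumes each Group m agent picks decision A independently with probability 1/2; the function names are hypothetical.

```python
import math

def delta_w0_gauss(n_m, n_d, reward):
    """Eq. (1): Gaussian approximation for the Group 0 wealth change."""
    sigma_m = math.sqrt(n_m) / 2.0
    x = n_d / (2.0 * math.sqrt(2.0) * sigma_m)
    return -(1.0 - reward) / 2.0 + (1.0 + reward) / 2.0 * math.erf(x)

def delta_w0_exact(n_m, n_d, reward):
    """Same quantity with n drawn from Binomial(n_m, 1/2).

    Group 0 wins when n < n_m/2 + n_d/2. An exact tie cannot occur when the
    total number of agents is odd; here it would be counted as a loss.
    Uses floating-point 2**n_m, so n_m should stay below about 1000.
    """
    threshold = (n_m + n_d) / 2.0
    p_win = sum(math.comb(n_m, n) for n in range(n_m + 1)
                if n < threshold) / 2.0 ** n_m
    return reward * p_win - (1.0 - p_win)

# Example with N_0 = 190, N_m = 401, N_1 = 210 (so N_d = 20) and R = 0.9.
print(delta_w0_gauss(401, 20, 0.9), delta_w0_exact(401, 20, 0.9))
```

For a few hundred cautious agents the two values should agree closely, which supports using the error-function form in what follows.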



Similarly we can derive the average wealth change for the agents in Group 1,

$$\Delta w_1 = -\frac{1-R}{2} - \frac{1+R}{2}\,
\mathrm{erf}\!\left(\frac{N_d}{2\sqrt{2}\,\sigma_m}\right). \tag{2}$$



Since the numbers $N_0$ and $N_1$ fluctuate, and on average $N_0$ and $N_1$
should be the same, we can average out the short-time fluctuations in $N_d$.
This allows us to find out how the agents in the "extreme" groups compare
with the cautious agents in Group m in the long run. The average wealth
change of the agents in Groups 0 and 1 is given by
$\Delta w_e = (N_0 \Delta w_0 + N_1 \Delta w_1)/(N_0 + N_1)$. Substituting
the expressions for $\Delta w_0$ and $\Delta w_1$, we have

$$\Delta w_e = -\frac{1-R}{2} - \frac{1+R}{2}\,\frac{N_d}{N_0 + N_1}\,
\mathrm{erf}\!\left(\frac{N_d}{2\sqrt{2}\,\sigma_m}\right). \tag{3}$$
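
Eqs. (1)-(3) are simple enough to evaluate directly. The sketch below (function names hypothetical) returns all three wealth changes so that one can confirm that Eq. (3) is the population-weighted average of Eqs. (1) and (2), and that its second term lowers the extreme agents' payoff for either sign of $N_d$.

```python
import math

def inefficiency_arg(n_m, n_d):
    """Argument of the error function, N_d / (2 sqrt(2) sigma_m)."""
    sigma_m = math.sqrt(n_m) / 2.0
    return n_d / (2.0 * math.sqrt(2.0) * sigma_m)

def delta_w_groups(n_0, n_m, n_1, reward):
    """Return (dw_0, dw_1, dw_e) from Eqs. (1)-(3)."""
    n_d = n_1 - n_0
    e = math.erf(inefficiency_arg(n_m, n_d))
    base = -(1.0 - reward) / 2.0
    dw_0 = base + (1.0 + reward) / 2.0 * e
    dw_1 = base - (1.0 + reward) / 2.0 * e
    # n_d and erf(...) share the same sign, so this term is never positive.
    dw_e = base - (1.0 + reward) / 2.0 * n_d / (n_0 + n_1) * e
    return dw_0, dw_1, dw_e
```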



Note that the second term in $\Delta w_e$, which is due to the fluctuations
in $N_d$, is always negative (since erf(x) is an odd function, the product
of $N_d$ and the error function is never negative). When $N_0 \neq N_1$, the
winning probabilities for making decision A and decision B are not equal,
and the market is not efficient. Thus this term can be interpreted as the
cost due to market inefficiency. Large market inefficiency on average
penalizes the players taking "extreme" positions more.

For the agents in Group m, if $n < N_m/2 + N_d/2$, then n agents in the
group win, while $N_m - n$ agents in the group lose. On the other hand, if
$n > N_m/2 + N_d/2$, then $N_m - n$ agents in the group win, but n agents
lose. We need to take these two cases into account when evaluating the
average:

$$\Delta w_m = \frac{1}{N_m}\left[
\int_0^{N_m/2 + N_d/2} \bigl(Rn - (N_m - n)\bigr) P(n)\,dn
+ \int_{N_m/2 + N_d/2}^{N_m} \bigl(R(N_m - n) - n\bigr) P(n)\,dn
\right].$$

After a few algebraic steps, we arrive at

$$\Delta w_m = -\frac{1-R}{2} - \frac{1+R}{\sqrt{2\pi N_m}}
\exp\!\left(-\frac{N_d^2}{2N_m}\right). \tag{4}$$



The first term in $\Delta w_m$ is the same as that in $\Delta w_e$. The
second term can be interpreted as the market impact. The magnitude of this
term is in fact the largest when $N_d = 0$. Large market impact
(self-interaction) penalizes the cautious players; their own decisions
increase their chances of being in the majority and hence their chances of
losing.
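
Eq. (4) can likewise be checked against a direct sum over the binomial distribution of n, assuming each cautious agent chooses decision A independently with probability 1/2. The function names below are hypothetical.

```python
import math

def delta_wm_gauss(n_m, n_d, reward):
    """Eq. (4): Gaussian approximation for the cautious group."""
    impact = (1.0 + reward) / math.sqrt(2.0 * math.pi * n_m)
    return -(1.0 - reward) / 2.0 - impact * math.exp(-n_d ** 2 / (2.0 * n_m))

def delta_wm_exact(n_m, n_d, reward):
    """Average payoff per Group m agent with n ~ Binomial(n_m, 1/2).

    Uses floating-point 2**n_m, so n_m should stay below about 1000.
    """
    threshold = (n_m + n_d) / 2.0
    total = 0.0
    for n in range(n_m + 1):
        prob = math.comb(n_m, n) / 2.0 ** n_m
        if n < threshold:          # decision A wins: n winners, n_m - n losers
            payoff = reward * n - (n_m - n)
        else:                      # decision B wins: n_m - n winners, n losers
            payoff = reward * (n_m - n) - n
        total += prob * payoff
    return total / n_m

# Example with N_m = 401, N_d = 20, R = 0.9.
print(delta_wm_gauss(401, 20, 0.9), delta_wm_exact(401, 20, 0.9))
```

Setting `n_d = 0` in either function shows the market-impact penalty at its largest, in line with the remark above.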

To determine the transition from clustering to self-segregation, we need to
calculate the distribution of $N_d$, which allows us to evaluate
$\Delta w_e$ and $\Delta w_m$. Let us denote the change in $N_d$ in one time
step as $\delta N$. On average
$\delta N = 2N_0/\bigl(|d|/((1-R)/2)\bigr) = N_0(1-R)/|d|$; this is the
average number of extreme agents switching their strategies per time step
(the adaptation rate). The factor 2 is included because an agent loses only
about half of the time. $|d|/((1-R)/2)$ is the average number of time steps
taken before the wealth threshold is reached. The dynamics of $N_d$ can be
described as a random walk with mean reversion (there is a higher
probability of moving towards $N_d = 0$). The individual step of the walk is
$\pm\delta N$. The probability of changing from $N_d$ to $N_d + \delta N$ is
$W_+(N_d)$, and the probability of changing to $N_d - \delta N$ is
$W_-(N_d)$, where

$$W_\pm(N_d) = \frac{1}{2}\left[1 \mp
\mathrm{erf}\!\left(\frac{N_d}{2\sqrt{2}\,\sigma_m}\right)\right].$$

The steady-state probability distribution $Q(N_d)$ for $N_d$ should satisfy

$$Q(N_d) = W_-(N_d + \delta N)\,Q(N_d + \delta N)
+ W_+(N_d - \delta N)\,Q(N_d - \delta N). \tag{5}$$

For small $\delta N$ one can convert the above equation to a differential
equation. The solution for $Q(N_d)$ is given by

$$Q(N_d) \propto \exp\!\left(-\frac{2}{\delta N} \int_0^{N_d}
\mathrm{erf}\!\left(\frac{n}{2\sqrt{2}\,\sigma_m}\right) dn\right). \tag{6}$$
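
One way to see that Eq. (6) is a reasonable solution of Eq. (5) is to insert it back into the discrete balance equation and measure the mismatch. The sketch below does this with the closed form $\int_0^x \mathrm{erf}(at)\,dt = x\,\mathrm{erf}(ax) + (e^{-a^2x^2} - 1)/(a\sqrt{\pi})$; the function names and the example values of $\sigma_m$ and $\delta N$ are hypothetical choices for illustration.

```python
import math

def w_plus(n_d, sigma_m):
    """Probability of a step n_d -> n_d + delta_n."""
    return 0.5 * (1.0 - math.erf(n_d / (2.0 * math.sqrt(2.0) * sigma_m)))

def w_minus(n_d, sigma_m):
    """Probability of a step n_d -> n_d - delta_n."""
    return 0.5 * (1.0 + math.erf(n_d / (2.0 * math.sqrt(2.0) * sigma_m)))

def q_unnormalized(n_d, sigma_m, delta_n):
    """Eq. (6) up to normalization, using the closed-form erf integral."""
    a = 1.0 / (2.0 * math.sqrt(2.0) * sigma_m)
    integral = (n_d * math.erf(a * n_d)
                + (math.exp(-(a * n_d) ** 2) - 1.0) / (a * math.sqrt(math.pi)))
    return math.exp(-2.0 * integral / delta_n)

def stationarity_residual(n_d, sigma_m, delta_n):
    """Relative mismatch between the two sides of Eq. (5)."""
    lhs = q_unnormalized(n_d, sigma_m, delta_n)
    rhs = (w_minus(n_d + delta_n, sigma_m)
           * q_unnormalized(n_d + delta_n, sigma_m, delta_n)
           + w_plus(n_d - delta_n, sigma_m)
           * q_unnormalized(n_d - delta_n, sigma_m, delta_n))
    return abs(rhs - lhs) / lhs

# Example: sigma_m = 50 (N_m = 10^4) and delta_n = 1.
print(max(stationarity_residual(x, 50.0, 1.0) for x in range(-20, 21)))
```

The residual should be small wherever $Q(N_d)$ carries appreciable weight, and it grows as $\delta N$ becomes comparable to $\sigma_m$, where the continuum approximation is no longer justified.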

Now we average $\Delta w_e$ and $\Delta w_m$ over the distribution
$Q(N_d)$. We can easily obtain

$$\Delta w_e = -\frac{1-R}{2} - \frac{1+R}{2}\,
\frac{\delta N}{2(N_0 + N_1)}. \tag{7}$$

$\Delta w_m$, on the other hand, is given by

$$\Delta w_m = -\frac{1-R}{2} - \frac{1+R}{\sqrt{2\pi N_m}}
\left\langle \exp\!\left(-\frac{N_d^2}{2N_m}\right) \right\rangle,$$

where the average is over the distribution $Q(N_d)$. This can be
approximated as

$$\Delta w_m \approx -\frac{1-R}{2} - \frac{1+R}{\sqrt{2\pi}}\,
\frac{1}{\sqrt{N_m + \sqrt{\pi/2}\,\sigma_m\,\delta N}},$$

since in the range $N_d < \sigma_m$, where the main contribution to the
average comes from, $Q(N_d)$ can be well approximated by a Gaussian
distribution centered at zero with width
$\sqrt{\sqrt{\pi/2}\,\sigma_m\,\delta N}$. At the critical point,
$N_0 = N_1 = N_m = N/3$, and $\Delta w_e = \Delta w_m$. It is easy to verify
that this occurs when $\delta N \sim \sqrt{N_m}$. As
$\delta N = N_0(1-R)/|d|$, the crossover value for |d| is
$|d_c| = A_0(1-R)\sqrt{N}$, where $A_0$ is a constant of order one.
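
The crossover condition can be solved numerically to confirm the $\sqrt{N}$ scaling. The sketch below equates the inefficiency cost in Eq. (7) with the averaged market-impact cost, taking $Q(N_d)$ to be a Gaussian of variance $\sqrt{\pi/2}\,\sigma_m\,\delta N$ (the small-$N_d$ limit of Eq. (6)); that closed form for the averaged impact term, and all function names, are assumptions of this sketch rather than expressions quoted from the simulations.

```python
import math

def excess_cost_extreme(n_0, n_1, delta_n, reward):
    """Inefficiency term of Eq. (7)."""
    return (1.0 + reward) / 2.0 * delta_n / (2.0 * (n_0 + n_1))

def excess_cost_cautious(n_m, delta_n, reward):
    """Market-impact term averaged over a Gaussian Q(N_d).

    Assumes Q has variance sqrt(pi/2) * sigma_m * delta_n (small-N_d limit).
    """
    sigma_m = math.sqrt(n_m) / 2.0
    width_sq = math.sqrt(math.pi / 2.0) * sigma_m * delta_n
    return (1.0 + reward) / math.sqrt(2.0 * math.pi * (n_m + width_sq))

def critical_d(n_agents, reward):
    """|d_c| for N_0 = N_1 = N_m = N/3, found by bisection on delta_n."""
    n_g = n_agents / 3.0
    lo, hi = 0.0, 100.0 * math.sqrt(n_g)   # the costs cross inside this bracket
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        gap = (excess_cost_extreme(n_g, n_g, mid, reward)
               - excess_cost_cautious(n_g, mid, reward))
        if gap < 0.0:
            lo = mid
        else:
            hi = mid
    delta_n_c = 0.5 * (lo + hi)
    # Invert delta_N = N_0 (1 - R) / |d| at the crossover.
    return n_g * (1.0 - reward) / delta_n_c

for n in (300, 3000, 30000):
    print(n, critical_d(n, 0.9) / ((1.0 - 0.9) * math.sqrt(n)))
```

Under these assumptions the printed ratio $|d_c|/((1-R)\sqrt{N})$ is the same for every N, which is the content of the scaling relation; its numerical value plays the role of $A_0$ in this simplified estimate.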

In the above derivation we simply use the averaged value for $\delta N$.
This underestimates the magnitude of $\Delta w_e$. For R close to 1, the
strategy switching in the "extreme" group is rather intermittent. There are
no agents switching strategy for many time steps, but in a single step many
agents in the group switch strategies. A loss in a single round, for
example, will not make the agents in the extreme group switch strategy if
they had won in the previous two rounds. We can take this intermittency into
account by introducing the probability z that strategy switching occurs in
the extreme group after it loses. We leave out the case $\delta N = 0$,
since it does not affect the distribution of $N_d$. The average $\delta N$
is now $N_0(1-R)/(z|d|)$. The crossover value for d is then given by
$|d_c| = (A_0/z)(1-R)\sqrt{N} = A(1-R)\sqrt{N}$. If $\delta N$ is close to
its averaged value and $z \sim 1$, A is of order one. The broader the
distribution of $\delta N$ and the larger the intermittency in strategy
switching among the agents in the extreme groups, the larger the value of A.
One can estimate the upper bound for A as follows. The probability z and
$\delta N$ are related to the wealth distribution of the agents in the
extreme groups. The minimum width of the wealth distribution is |d|, so
$\delta N < N/|d|$. The upper bound on $|d_c|$ is thus obtained with
$\delta N = N/|d|$ and z = 1 - R; this leads to $|d_c| \sim \sqrt{N}$, or
$A \sim 1/(1-R)$. Figure 3 shows A vs N for various R values. One can see
that A becomes independent of N for sufficiently large N (this means that
$|d_c| \propto \sqrt{N}$ holds well numerically). The value of A indeed
approaches the upper bound $A_0/(1-R)$ for the three-group EMG when
$N > 1/(1-R)^2$. This can be understood by the following simple argument.
The width of the wealth distribution is close to |d| when |d| is greater
than the wealth fluctuation, which is roughly $\sqrt{|d|/(1-R)}$, given that
the average time for strategy switching is about |d|/(1 - R). Thus when
$|d| > 1/(1-R)$, or $N > 1/(1-R)^2$, the upper bound for A is reached.
However, this is likely to be a feature unique to the three-group EMG model.
For the original EMG the value of A is of order one for a wide range of R,
as can also be seen from Figure 3.

The theory can be generalized to the original EMG model by generalizing the
definition of $N_d$ to $N_d = 2(\bar{p} - 1/2)$, where $\bar{p}$ is the
average of the p values among all the agents at a given time step. The
market inefficiency is again measured by the fluctuation in $N_d$. Consider
the version in which an agent chooses a new p randomly when its wealth is
below d; then we can argue that $\delta N$ (the average change in $N_d$) is
again given by $\delta N \sim N(1-R)/|d|$. So we have
$|d_c| = A(1-R)\sqrt{N}$; this works well because the fluctuation of
$\delta N$ is likely to be much smaller in the original model than in the
three-group model. We can also understand the version of the model in which
the new p value is chosen in an interval of width $\delta p$ around the old
p value. Since a smaller $\delta p$ leads to a smaller $\delta N$, the cost
due to market inefficiency is reduced. This favors the "extreme" agents
($|d_c|$ is smaller for a smaller $\delta p$); it is consistent with the
results obtained in Ref. [11]. Ref. [2] found that the periodic boundary
condition used in the redistribution of the p value favors clustering. This
is also not surprising. When the boundary condition is periodic in p,
$\delta N$ is effectively increased, because some p = 0 agents can switch to
p = 1 agents even when $\delta p$ is small.

FIG. 3: A vs N for various values of R. The results from the three-group EMG
and the original EMG with random redistribution are shown.

In conclusion, we have derived a general formalism for studying the
transition from clustering to self-segregation based on the statistical
mechanics of a simplified three-group model. We find that frequent strategy
switching leads to market inefficiency, which favors the clustering of
cautious agents. A general expression relating the number of agents, the
wealth threshold, and the reward-to-fine ratio at the critical point is
derived. This expression is found to be equally valid for the general EMG.